(57) ABSTRACT

A distributed (e.g., client/server) computing environment is 
described which implements protocol methodology improv- 
ing the streaming of objects, such as for distributed appli- 
cations. In particular, the methodology facilitates streaming 
of objects (e.g., Java objects) stored and managed remotely 
(e.g., objects stored and managed in relational databases) to 
clients in a highly efficient manner. The methodology may 
be implemented by extending an existing streaming meth- 
odology or protocol to include a class identifier approach for 
supporting object serialization. A Class ID (ACI) serialization
is provided as a protocol for converting between a Java
object and a binary representation. ACI is intended for an
environment in which all classes ever involved in any 
serialization are known by the environment (as is often the 
case). Each class known to the environment is represented 
by a compact numeric identifier, and it is this identifier alone 
that is used to represent the class description in the serial- 
ization. A table of the class identifiers is kept at the begin- 
ning of each serialization. A simple transformation is applied 
to achieve portability, so that any ACI serialization can be 
converted to a portable serialization, a Class Descriptor 
serialization (ACD). The ACD is identical to ACI except that 
the class identifier table beginning ACI is replaced by a table 
of class descriptors. These class descriptors contain virtually 
the same information as standard (e.g., Sun) class 
descriptors, so an ACD serialization has the same portability 
characteristics as Sun serialization. In this manner, the 
present invention provides the ability to create and stream 
objects, particularly Java objects, in a manner which does 
not incur a substantial size or resource penalty. 

18 Claims, 6 Drawing Sheets 
SYSTEM AND METHOD FOR IMPROVED
SERIALIZATION OF JAVA OBJECTS

RELATED APPLICATIONS

The present application claims the benefit of priority from and is
related to the following commonly-owned U.S. application: application
Ser. No. 60/127,653, entitled System and Method for Improved
Serialization of Java Objects, filed Apr. 2, 1999. The present
application is also related to the following commonly-owned U.S.
application: application Ser. No. 09/233365, entitled System and
Method for Serializing Java Objects in a Tabular Data Stream, filed
Jan. 19, 1999 and now U.S. Pat. No. 6,356,946. The disclosures of the
foregoing applications are hereby incorporated by reference in their
entirety, including any appendices or attachments thereof, for all
purposes.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material
which is subject to copyright protection. The copyright owner has no
objection to the facsimile reproduction by anyone of the patent
document or the patent disclosure as it appears in the Patent and
Trademark Office patent file or records, but otherwise reserves all
copyright rights whatsoever.

BACKGROUND OF THE INVENTION

The present invention relates generally to data access and processing
in a distributed computing system and, more particularly, to a system
implementing methodology for improving data streaming of objects in
distributed computer environments.

Computers are very powerful tools for storing and providing access to
vast amounts of information. Computer databases are a common mechanism
for storing information on computer systems while providing easy
access to users. A typical database is an organized collection of
related information stored as "records" having "fields" of
information. As an example, a database of employees may have a record
for each employee where each record contains fields designating
specifics about the employee, such as name, home address, salary, and
the like.

Between the actual physical database itself (i.e., the data actually
stored on a storage device) and the users of the system, a database
management system or DBMS is typically provided as a software cushion
or layer. In essence, the DBMS shields the database user from knowing
or even caring about underlying hardware-level details. Typically, all
requests from users for access to the data are processed by the DBMS.
For example, information may be added or removed from data files,
information retrieved from or updated in such files, and so forth, all
without user knowledge of underlying system implementation. In this
manner, the DBMS provides users with a conceptual view of the database
that is removed from the hardware level. The general construction and
operation of a database management system is known in the art. See
e.g., Date, C., An Introduction to Database Systems, Volume I and II,
Addison Wesley, 1990; the disclosure of which is hereby incorporated
by reference.

DBMS systems have long since moved from a centralized mainframe
environment to a de-centralized or distributed environment. One or
more PC "client" systems, for instance, may be connected via a network
to one or more server-based database systems (SQL database server).
Well-known examples of computer networks include local-area networks
(LANs) where the computers are geographically close together (e.g., in
the same building), and wide-area networks (WANs) where the computers
are farther apart and are connected by telephone lines or radio waves.

Often, networks are configured as "client/server" networks, such that
each computer on the network is either a "client" or a "server."
Servers are powerful computers or processes dedicated to managing
shared resources, such as storage (i.e., disk drives), printers,
modems, or the like. Servers are often dedicated, meaning that they
perform no other tasks besides their server tasks. For instance, a
database server is a computer system that manages database
information, including processing database queries from various
clients. The client part of this client-server architecture typically
comprises PCs or workstations which rely on a server to perform some
operations. Typically, a client runs a "client application" that
relies on a server to perform some operations, such as returning
particular database information. Often, client-server architecture is
thought of as a "two-tier architecture," one in which the user
interface runs on the client or "front end" and the database is stored
on the server or "back end." The actual business rules or application
logic driving operation of the application can run on either the
client or the server (or even be partitioned between the two). In a
typical deployment of such a system, a client application, such as one
created by an information service (IS) shop, resides on all of the
client or end-user machines. Such client applications interact with
host database engines (e.g., Sybase® Adaptive Server™), executing
business logic which traditionally ran at the client machines.

More recently, the development model has shifted from standard
client/server or two-tier development to a three-tier (or n-tier)
component-based development model. This newer client/server
architecture introduces three well-defined and separate processes,
each typically running on a different platform. A "first tier"
provides the user interface, which runs on the user's computer (i.e.,
the client). Next, a "second tier" provides the functional modules
that actually process data. This middle tier typically runs on a
server, often called an "application server." A "third tier" furnishes
a database management system (DBMS) that stores the data required by
the middle tier. This tier may run on a second server called the
database server.

The three-tier design has many advantages over traditional two-tier or
single-tier designs. For example, the added modularity makes it easier
to modify or replace one tier without affecting the other tiers.
Separating the application functions from the database functions makes
it easier to implement load balancing. Thus, by partitioning
applications cleanly into presentation, application logic, and data
sections, the result will be enhanced scalability, reusability,
security, and manageability.

In a typical client/server environment, the client knows about the
database directly and can submit a database query for retrieving a
result set which is generally returned as a tabular data set. In a
three-tier environment, particularly a component-based one, the client
never communicates directly with the database. Instead, the client
typically communicates through one or more components. Components
themselves are defined using one or more interfaces, where each
interface is a collection of methods. In general, components return
information via output parameters. In the conventional, standard
client/server development model, in contrast, information is often
returned from databases in the form of tabular result sets, via a
database interface such as Open Database Connectivity (i.e., ODBC,
available from
Microsoft Corp. of Redmond, Washington) or Java Database Connectivity
(i.e., JDBC, available from Sun Microsystems of Mountain View,
Calif.). A typical three-tier environment would, for example, include
a middle tier comprising business objects implementing business rules
(logic) for a particular organization. The business objects, not the
client, communicate with the database.

For their part, application writers or developers like to write
object-oriented programs using modern object-oriented programming
techniques. At the same time, however, these developers prefer to have
their data (i.e., the data employed by the application) stored in a
database having relational tables, as that is an easy way of storing
and retrieving data. A particular problem arises when one wants to
retrieve data from the database for use (e.g., manipulation) within
one's program: how is this "flat" data converted into objects? In this
regard "object" refers to the specific programming construct that
defines associated data members and methods (typically, including data
hiding and containment), such as an object instantiated from a C++
class, a Java class, an Object Pascal class, or the like.

One of the advantages of Java as an object-oriented language over C++
is in Java's ability to flatten objects into a standard binary
representation. This ability to flatten objects allows the persisting
of objects in files or databases, or transmission of objects between
applications across a network. Because the representation is standard,
applications written by different vendors can exchange objects without
having to revert to a proprietary protocol. This standard
representation was developed by Sun Microsystems and will be referred
to herein as Sun serialization.

Sun serialization is a protocol for converting between a Java object
and its binary representation. The binary representation is an array
of bytes coded to represent the Java object using the Sun
serialization protocol. How the Java object is represented within its
particular host virtual runtime environment (virtual machine or VM) is
irrelevant to its binary encoding. A Java object itself is a
collection of data fields whose values are interpreted by a Java
class. The Java class of an object may specify one or more named typed
fields, whose values are contained in the object. Java classes can be
"subclassed," meaning other classes can inherit the named typed fields
of a particular class, and provide additional named typed fields.

When an object is serialized using Sun serialization, a description of
the object's class is serialized along with it. The class description
is the template that allows the object to be reconstructed. Such a
template allows a meaningful interpretation of the object's data,
without which the data would just be a stream of bytes. The class
description includes details of the class field names and types. With
this information, another goal is achieved: the description acts as
versioning information. Classes can be modified over time as the
development process dictates, and an object serialized under an
earlier version of a particular class must be able to be
"deserialized" as a newer version of the class. This would, in
general, be impossible without the serialization's inclusion of class
field names and types. When deserialization takes place the old class
description can be compared to the newer description and fields can be
mapped as needed.

The inclusion of the detailed class description in the object
serialization makes those serializations portable and "versionable."
Unfortunately, however, this is done today at the expense of sometimes
considerable size required to represent the descriptions. A time
penalty also results, from the time taken to write the description.
Accordingly, a better solution is sought.

What is desired is a solution providing the ability to create and
stream objects, particularly Java objects, in a manner which does not
incur a considerable size or resource penalty. Moreover, such a
solution should preserve portability. The present invention fulfills
this and other needs.

SUMMARY OF THE INVENTION

A distributed (e.g., client/server) computing environment is described
which, in accordance with the present invention, simplifies the use of
objects in distributed applications or in other instances where
transfer of objects is required. In particular, the invention provides
an improved methodology for streaming objects (e.g., Java objects)
stored and managed remotely (e.g., objects stored and managed in
relational databases) to clients in a highly efficient manner. Once at
the clients, the objects may be executed or otherwise manipulated as
desired.

The present invention may be implemented by extending an existing
streaming methodology or protocol, such as Sybase Tabular Data Stream
(TDS) protocol or other comparable streaming protocol. Streaming is
modified to include a class identifier approach of the present
invention for supporting object serialization. A Class ID (referred to
hereafter as ACI) serialization is provided as a protocol for
converting between a Java object and a binary representation. Like Sun
serialization, it operates to provide object serialization. Unlike Sun
serialization, however, the class description required in ACI is
dramatically less, thereby minimizing the time penalty and storage
requirements usually required to represent class description
information in a stream.

ACI is intended for an environment in which all classes ever involved
in any serialization are known by the environment (as is often the
case). Each class known to the environment is represented by a compact
numeric identifier, and it is this identifier alone that is used to
represent the class description in the serialization. A table of the
class identifiers is kept at the beginning of each serialization. ACI
is much smaller but, without further enhancement, the approach would
be at the expense of portability. In accordance with the present
invention, however, a simple transformation is applied so that any ACI
serialization can be converted to a portable serialization.

Class Descriptor serialization (ACD) is also provided for achieving
portability. The ACD is identical to ACI except that the class
identifier table beginning ACI is replaced by a table of class
descriptors. These class descriptors contain virtually the same
information as Sun class descriptors, so an ACD serialization has the
same portability characteristics as Sun serialization. Converting
between ACI and ACD serializations is a very simple and
computationally frugal process. Because both are otherwise identical
(apart from the class identifier tables), only the class table
contents need change. The environment maintains a correspondence
between the ACI class identifiers and ACD class descriptors. In this
manner, the present invention provides the ability to create and
stream objects, particularly Java objects, in a manner which does not
incur a substantial size or resource penalty.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a block diagram of a computer system in which the present
invention may be embodied.

FIG. 1B is a block diagram of a software system for controlling
operation of the computer system of FIG. 1A.

FIG. 2 is a block diagram of a distributed computer environment (which
includes the computer system of FIG. 1A) in which the present
invention is preferably embodied.
FIGS. 3A-E present a flowchart illustrating an object serialization
method of the present invention.

DETAILED DESCRIPTION OF A PREFERRED 
EMBODIMENT 

The following description will focus on the presently- 
preferred embodiment of the present invention, which is 
operative in a distributed computing environment executing 
application programs which interact with remote data, such 
as that which is stored on an SQL database server. The 
present invention, however, is not limited to any particular 
application or environment. Instead, those skilled in the art 
will find that the present invention may be advantageously 
applied to any application or environment where optimiza- 
tion of object data access and processing is desirable, 
including non-SQL database management systems and the 
like. The following description is, therefore, for the purpose 
of illustration and not limitation. 
Standalone System Hardware 

The invention may be embodied on a computer system 
such as the system 100 of FIG. 1A, which comprises a 
central processor 101, a main memory 102, an input/output 
controller 103, a keyboard 104, a pointing device 105 (e.g., 
mouse, track ball, pen device, or the like), a screen display 
device 106, and a mass storage 107 (e.g., hard or fixed disk, 
removable disk, optical disk, magneto-optical disk, or flash 
memory). Processor 101 includes or is coupled to a cache 
memory 109 for storing frequently accessed information; 
memory 109 may be an on-chip cache or external cache (as 
shown). Additional output device(s) 108, such as a printing 
device, may be included in the system 100 as desired. As 
shown, the various components of the system 100 commu- 
nicate through a system bus 110 or similar architecture. In a 
preferred embodiment, the system 100 includes an IBM- 
compatible personal computer system, available from a 
variety of vendors (including IBM of Armonk, New York). 
Standalone System Software 

Illustrated in FIG. 1B, a computer software system 150 is
provided for directing the operation of the computer system 
100. Software system 150, which is stored in system 
memory 102 and on mass storage or disk memory 107, 
includes a kernel or operating system (OS) 140 and a 
windows-based GUI (graphical user interface) shell 145. 
One or more application programs, such as application 
software programs 155, may be "loaded" (i.e., transferred 
from storage 107 into memory 102) for execution by the 
system 100. The system also includes a user interface 160 
for receiving user commands and data as input and display- 
ing result data as output. 

Also shown, the software system 150 includes a Rela- 
tional Database Management System (RDBMS) front-end 
or "client" 170. The RDBMS client 170 may be any one of 
a number of database front-ends, including PowerBuilder™, 
Sybase PowerJ™, PowerC++™, Borland Paradox®, 
Microsoft® Access, or the like, and the front-end may 
include SQL access drivers (e.g., Sybase JConnect™ JDBC 
driver or the like) for accessing database tables from an SQL 
database server operating in a Client/Server environment. 
Client/Server Database Management System 

While methods of the present invention may operate 
within a single (standalone) computer (e.g., system 100 of 
FIG. 1A), the present invention is preferably embodied in a 
distributed computer environment, such as illustrated in 
FIG. 2. Distributed computing environment 200 includes JDBC-enabled
client application(s) 210 (e.g., clients 211, 213, 215) that connect
to back-end database server(s) 230 (e.g., servers 231, 233, 235) via a
Database Driver architecture 220. Database Driver architecture 220
itself includes a Java SQL driver interface 221 and manager 223 (e.g.,
the Java SQL driver manager provided by Sun Microsystems), for
managing one or more JDBC drivers, such as JDBC driver 325 (i.e.,
Sybase JConnect™ JDBC driver provided by Sybase, Inc.). In an
exemplary embodiment, the Clients may themselves include thin-client
applications (e.g., Java programs) running on standalone workstations,
dumb terminals, or personal computers (PCs), such as the
above-described system 100. Typically, such units would operate under
a client operating system, such as Microsoft Windows 9x for PC
clients. Each Back End Server, such as Sybase® SQLServer™, now Sybase®
Adaptive Server™ (available from Sybase, Inc. of Emeryville, Calif.)
in an exemplary embodiment, generally operates as an independent
process (i.e., independently of the Clients), running under a server
operating system such as Microsoft Windows NT (Microsoft Corp. of
Redmond, Wash.), NetWare (Novell of Provo, Utah), UNIX (Novell), or
OS/2 (IBM). Here, the components of the system communicate over a
network which may be any one of a number of conventional network
systems, including a Local Area Network (LAN) or Wide Area Network
(WAN), as is known in the art (e.g., using Ethernet, IBM Token Ring,
or the like). The network includes functionality for packaging client
calls in the well-known SQL (Structured Query Language) together with
any parameter information into a format (of one or more packets)
suitable for transmission across a cable or wire, for delivery to the
database servers.

Client/server environments, database servers, and networks are well
documented in the technical, trade, and patent literature. For a
discussion of database servers and client/server environments
generally, and SQL Server™ particularly, see, e.g., Nath, A., The
Guide to SQL Server, Second Edition, Addison-Wesley Publishing
Company, 1995. Additional documentation of SQL Server™ is available
from Sybase, Inc. as SQL Server Documentation Set (Catalog No. 49600).
The disclosures of each of the foregoing are hereby incorporated by
reference.

Improved serialization of Java objects
A. Introduction 

It is known in the art to employ a communication protocol for
effecting communication between database components, such as between a
client front end and a database server back end. Typically, such
communication protocols include native support for traditional SQL
(e.g., ANSI SQL-92) data types, such as character (char),
variable-length character (varchar), binary (blob), date-time, and
time stamp, together with some support for vendor-specific data types.

One example is Sybase® "Tabular Data Stream" which provides a
communication protocol for effecting communication between
Sybase-branded database products. Tabular Data Stream (TDS) is an
application-level protocol used to send requests and responses between
clients and servers. A client's request may contain multiple commands.
The response from the server may return one or many result sets. TDS
relies on a connection-oriented transport service. Session,
presentation, and application service elements are provided by TDS.
Since TDS does not require any specific transport provider, it can be
implemented over multiple transport protocols if they provide
connection-oriented service. TDS provides support for login capability
negotiation, authentication services, and support for both database
specific and generic client commands. Responses to client commands are
returned using a self-describing, table-oriented protocol. Column name
and data type information is returned to the client before the actual
data is returned.
The TDS protocol is mostly a token-based protocol where the contents
of a Protocol Data Unit (PDU) are tokenized. The token and its data
stream describe a particular command or part of a result set returned
to a client. For example, there is a token called TDS_LANGUAGE which
is used by a client to send language, typically SQL, commands to a
server. There is also a token called TDS_ROWFMT which describes the
column name, status, and data type which is used by a server to return
column format information to a client. The TDS protocol is
half-duplex. A client writes a complete request and then reads a
complete response from the server. Requests and responses cannot be
intermixed and multiple requests cannot be outstanding.

A TDS request or response may span multiple PDUs. The 
size of the PDU sent over the transport connection is 
negotiated at dialog establishment time. Each PDU contains 
a header, which is usually followed by data. A PDU header 
contains information about the size and contents of the PDU 
as well as an indication if it is the last PDU in a request or 
response. 
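
The PDU framing just described can be pictured with a minimal Java sketch. The text above does not give the TDS header layout, so the 4-byte header used here (a type byte, a last-PDU flag, and a 2-byte total length) is purely a hypothetical stand-in, as are the names PduFraming, split, and reassemble. The sketch only shows how one request can span several PDUs, with the header marking the final one.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class PduFraming {
    // Hypothetical header (NOT the real TDS layout):
    // byte 0 = type, byte 1 = last-PDU flag, bytes 2-3 = total PDU size.
    static final int HEADER_SIZE = 4;

    /** Splits a message into PDUs no larger than the negotiated maxPduSize. */
    static List<byte[]> split(byte type, byte[] message, int maxPduSize) {
        int maxData = maxPduSize - HEADER_SIZE;
        if (maxData <= 0 || maxPduSize > 0xFFFF) {
            throw new IllegalArgumentException("unsupported PDU size: " + maxPduSize);
        }
        List<byte[]> pdus = new ArrayList<>();
        int offset = 0;
        do {
            int n = Math.min(maxData, message.length - offset);
            boolean last = offset + n == message.length;
            byte[] pdu = new byte[HEADER_SIZE + n];
            pdu[0] = type;
            pdu[1] = (byte) (last ? 1 : 0);
            pdu[2] = (byte) (pdu.length >>> 8);
            pdu[3] = (byte) pdu.length;
            System.arraycopy(message, offset, pdu, HEADER_SIZE, n);
            pdus.add(pdu);
            offset += n;
        } while (offset < message.length);
        return pdus;
    }

    /** Concatenates PDU payloads, stopping at the PDU flagged as last. */
    static byte[] reassemble(List<byte[]> pdus) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] pdu : pdus) {
            int size = ((pdu[2] & 0xFF) << 8) | (pdu[3] & 0xFF);
            out.write(pdu, HEADER_SIZE, size - HEADER_SIZE);
            if (pdu[1] == 1) {
                break;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] request = "select name from sysobjects where id<3".getBytes();
        List<byte[]> pdus = split((byte) 1, request, 16);
        System.out.println(pdus.size() + " PDUs, reassembled: "
                + new String(reassemble(pdus)));
    }
}
```

Carrying a last-PDU flag in the header lets the receiver know when a half-duplex request or response is complete without relying on the transport connection being closed.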

As an illustration of this protocol, consider, for example, 
the SQL statement: "select name from sysobjects where 
id<3". The following will illustrate a high-level description 
of the TDS tokens exchanged by a client and a server to 
establish a dialog and then execute a simple SQL query. The 
query causes two table rows to be returned to the client. The 
client first requests a transport connection to the server and 
then sends a login record to establish a dialog. The login 
record contains capability and authentication information. 



Client                          Server

login packet            --->
                        <---    TDS_LOGINACK
                        <---    TDS_DONE









Now that a dialog has been established between the client 
and the server, the client sends the SQL query to the server 
and then waits for the server to respond. 



Client                                  Server

TDS_LANGUAGE: "select name..."  --->





The server executes the query and returns the results to the 
client. First the data columns are described by the server, 
followed by the actual row data. A completion token follows 
the row data indicating that all row data associated with the 
query has been returned to the client. 



Client                          Server

                        <---    TDS_ROWFMT row description
                        <---    TDS_ROW row data
                        <---    TDS_ROW row data
                        <---    TDS_DONE
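
The response sequence above (row format, then row data, then a completion token) can be modeled with a small Java sketch. This is an in-memory illustration under stated assumptions, not a TDS implementation: the Token and ResultTable types, the readResponse method, and the two row values are hypothetical, and real tokens are binary-encoded rather than lists of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class TokenStreamSketch {
    /** Simplified stand-ins for TDS_ROWFMT, TDS_ROW and TDS_DONE. */
    enum TokenType { ROWFMT, ROW, DONE }

    static final class Token {
        final TokenType type;
        final List<String> values; // column names for ROWFMT, column values for ROW

        Token(TokenType type, String... values) {
            this.type = type;
            this.values = Arrays.asList(values);
        }
    }

    static final class ResultTable {
        List<String> columns = Collections.emptyList();
        final List<List<String>> rows = new ArrayList<>();
    }

    /** Reads one response: the format arrives first, rows follow, DONE ends it. */
    static ResultTable readResponse(Iterator<Token> tokens) {
        ResultTable table = new ResultTable();
        while (tokens.hasNext()) {
            Token token = tokens.next();
            switch (token.type) {
                case ROWFMT:
                    table.columns = token.values;
                    break;
                case ROW:
                    if (table.columns.isEmpty()) {
                        throw new IllegalStateException("row data before row format");
                    }
                    table.rows.add(token.values);
                    break;
                case DONE:
                    return table;
            }
        }
        throw new IllegalStateException("response ended without a DONE token");
    }

    public static void main(String[] args) {
        // Illustrative values only; the actual rows depend on the database.
        List<Token> response = Arrays.asList(
                new Token(TokenType.ROWFMT, "name"),
                new Token(TokenType.ROW, "sysobjects"),
                new Token(TokenType.ROW, "sysindexes"),
                new Token(TokenType.DONE));
        ResultTable table = readResponse(response.iterator());
        System.out.println(table.columns + " " + table.rows);
    }
}
```

Because the column description precedes the data, the reader can interpret each row as it arrives, which is the self-describing property noted earlier.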



Although the above-described communication protocol is 
employed in the preferred embodiment, the present inven- 
tion may be implemented using any comparable data stream- 
ing protocol. Regardless of the communication protocol 
employed, the protocol is extended in accordance with the 
present invention to support object-based data types, such as 
Java objects, thus allowing these objects to become full class 
SQL objects. 
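
The class identifier approach summarized earlier (a compact form led by a table of class identifiers, and a portable form led by a table of class descriptors, with otherwise identical contents) can be pictured with the following Java sketch. The wire layouts of the two forms are not specified at this point in the text, so every type and method name below (Descriptor, Environment, Compact, Portable, toPortable, toCompact) is a hypothetical illustration of the idea rather than the format itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ClassTableSketch {

    /** Hypothetical class descriptor: class name plus field types and names. */
    static final class Descriptor {
        final String className;
        final List<String> fields;

        Descriptor(String className, String... fields) {
            this.className = className;
            this.fields = Arrays.asList(fields);
        }
    }

    /** Environment-wide correspondence between numeric ids and descriptors. */
    static final class Environment {
        private final Map<Integer, Descriptor> byId = new HashMap<>();
        private final Map<String, Integer> idByName = new HashMap<>();

        void register(int id, Descriptor descriptor) {
            byId.put(id, descriptor);
            idByName.put(descriptor.className, id);
        }

        Descriptor descriptorFor(int id) {
            Descriptor d = byId.get(id);
            if (d == null) throw new IllegalArgumentException("unknown class id " + id);
            return d;
        }

        int idFor(String className) {
            Integer id = idByName.get(className);
            if (id == null) throw new IllegalArgumentException("unknown class " + className);
            return id;
        }
    }

    /** Compact form: a table of class identifiers, then the object data. */
    static final class Compact {
        final int[] classIds;
        final byte[] body;

        Compact(int[] classIds, byte[] body) {
            this.classIds = classIds;
            this.body = body;
        }
    }

    /** Portable form: a table of class descriptors, then the same object data. */
    static final class Portable {
        final List<Descriptor> descriptors;
        final byte[] body;

        Portable(List<Descriptor> descriptors, byte[] body) {
            this.descriptors = descriptors;
            this.body = body;
        }
    }

    /** Only the leading table changes; the object data is reused untouched. */
    static Portable toPortable(Compact compact, Environment env) {
        List<Descriptor> table = new ArrayList<>();
        for (int id : compact.classIds) {
            table.add(env.descriptorFor(id));
        }
        return new Portable(table, compact.body);
    }

    static Compact toCompact(Portable portable, Environment env) {
        int[] ids = new int[portable.descriptors.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = env.idFor(portable.descriptors.get(i).className);
        }
        return new Compact(ids, portable.body);
    }

    public static void main(String[] args) {
        Environment env = new Environment();
        env.register(7, new Descriptor("Address", "String street", "int zipCode"));
        Compact compact = new Compact(new int[] {7}, new byte[] {1, 2, 3});
        Portable portable = toPortable(compact, env);
        System.out.println(portable.descriptors.get(0).className);
        System.out.println(Arrays.toString(toCompact(portable, env).classIds));
    }
}
```

The conversion never inspects the object data, which is why it can be cheap: the cost is one table lookup per class mentioned in the serialization.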
B. Support for Storing Java Objects as Column Data in a Table

1. General 

SQL databases may be modified to directly store Java objects as column
data in a database. In this manner, the user is able to create queries
(e.g., in SQL) that have predicates that refer to individual objects,
including their individual fields and methods, in an extended form of
SQL. In a database, Java classes are treated as data types, and a
column can be declared with a Java class as its data type. The
corresponding JDBC access driver supports storing Java objects in a
database by implementing setObject( ) methods and getObject( )
methods. This makes it possible to use the JDBC driver with an
application that uses native JDBC classes and methods to directly
store and retrieve Java objects as column data. The following
describes the requirements and procedures for storing objects in a
table and retrieving them using the JDBC driver in the system of the
present invention.

2. Prerequisites for Storing Java Objects As Column Data 
In order to store Java objects belonging to a user-defined Java class
in a column, the following requirements should be met. First, the
class should implement the java.io.Serializable interface. This is
because the JDBC driver in a preferred embodiment employs the native
Java serialization and deserialization to send objects to a database
and receive them back from the database. Second, the class definition
should be installed in the destination database. Finally, the client
system should have the class definition in a class file that is
accessible through the local CLASSPATH environment variable.
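
As a minimal sketch of the first requirement, the following Java example flattens an instance of a Serializable class with the standard java.io object streams and then rebuilds it. The Point class and the helper names (toBytes, fromBytes, containsText) are hypothetical and chosen only for illustration; how a particular JDBC driver invokes serialization internally is not shown here. The example also checks that the class name is embedded in the serialized bytes, which is the class description overhead discussed in the Background.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

final class SerializationRoundTrip {

    /** Hypothetical user-defined class; it only has to implement Serializable. */
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Flattens an object into its standard Java serialized form. */
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds an object; its class must be available on the class path. */
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** True if the serialized bytes contain the given ASCII text. */
    static boolean containsText(byte[] bytes, String text) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toBytes(new Point(3, 4));
        Point copy = (Point) fromBytes(bytes);
        System.out.println("restored: " + copy.x + "," + copy.y);
        System.out.println("class name embedded: "
                + containsText(bytes, Point.class.getName()));
        System.out.println("serialized size: " + bytes.length + " bytes");
    }
}
```

The last two lines of output make the trade-off visible: two int fields carry 8 bytes of data, while the serialized form is considerably larger because it also describes the class.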

3. Sending a Java Object to a Database 

To send an instance of a user-defined class as column data, one
employs one of the following setObject( ) methods, as specified in the
PreparedStatement interface:
void setObject(int parameterIndex, Object x, int targetSqlType,
    int scale) throws SQLException;
void setObject(int parameterIndex, Object x, int targetSqlType)
    throws SQLException;
void setObject(int parameterIndex, Object x) throws SQLException;
The following example defines an Address class, shows the definition
of a "Friends" table that has an Address column whose data type is the
Address class, and inserts a row into the table.



import java.io.Serializable;

public class Address implements Serializable
{
    public String streetNumber;
    public String street;
    public String apartmentNumber;
    public String city;
    public int zipCode;
    // Methods
}

Friends table:
    (varchar(30) firstname,
     varchar(30) lastname,
     Address address,
     varchar(15) phone)

// Connect to the database containing the Friends table.
// (The JDBC types used below come from the java.sql package.)
Connection conn =
    DriverManager.getConnection("jdbc:sybase:Tds:localhost:5000",
        "username", "password");
// Create a PreparedStatement object with an insert statement
// for updating the Friends table.
PreparedStatement ps = conn.prepareStatement(
    "INSERT INTO Friends values (?,?,?,?)");
// Now, set the values in the prepared statement object, ps.
// Set first name to "Joan."
ps.setString(1, "Joan");
// Set last name to "Smith."
ps.setString(2, "Smith");
// Assuming that we already have "Joan_address" as an instance
// of Address, use setObject(int parameterIndex, Object x, int
// targetSqlType) to set the address column to "Joan_address."
// Note that the targetSqlType is java.sql.Types.JAVA_OBJECT, with a
// designated integer value of "2000."
ps.setObject(3, Joan_address, 2000);
// Set the phone column to Joan's phone number.
ps.setString(4, "123-456-7890");
// Perform the insert.
ps.executeUpdate();



4. Receiving a Java Object from the Database
A client JDBC application can receive a Java object from the database
in a result set or as the value of an output parameter returned from a
stored procedure. If a result set contains a Java object as column
data, one may employ the following getObject( ) methods in the
ResultSet interface to assign the object to a class variable.
Object getObject(int columnIndex) throws SQLException;
Object getObject(String columnName) throws SQLException;

If an output parameter from a stored procedure contains a Java object,
one may employ the following getObject( ) method in the
CallableStatement interface to assign the object to a class variable.

Object getObject(int parameterIndex) throws SQLException;

The following example illustrates the use of
ResultSet.getObject(int columnIndex) to assign an object received in a
result set to a class variable. The example uses the Address class and
Friends table of the previous section and presents a simple
application that prints a name and address on an envelope.



/*
** This application takes a first and last name, gets the
** specified person's address from the Friends table in the
** database, and addresses an envelope using the name and
** retrieved address.
*/
import java.sql.*;

public class Envelope
{
    Connection conn = null;
    String firstName = null;
    String lastName = null;
    String street = null;
    String city = null;
    String zip = null;

    // Constructor assumed here; the original listing does not show it.
    public Envelope(int heightInches, int widthInches) { }

    public static void main(String[ ] args)
    {
        if (args.length < 2)
        {
            System.out.println("Usage: Envelope <firstName> <lastName>");
            System.exit(1);
        }
        // create a 4" x 10" envelope
        Envelope e = new Envelope(4, 10);
        try
        {
            // connect to the database with the Friends table.
            e.conn = DriverManager.getConnection(
                "jdbc:sybase:Tds:localhost:5000", "username",
                "password");
            // look up the address of the specified person
            e.firstName = args[0];
            e.lastName = args[1];
            PreparedStatement ps = e.conn.prepareStatement(
                "SELECT address FROM friends WHERE " +
                "firstname = ? AND lastname = ?");
            ps.setString(1, e.firstName);
            ps.setString(2, e.lastName);
            ResultSet rs = ps.executeQuery( );
            if (rs.next( ))
            {
                Address a = (Address) rs.getObject(1);
                // set the destination address on the envelope
                e.setAddress (e.firstName, e.lastName, a);
            }
            e.conn.close( );
        }
        catch (SQLException sqe)
        {
            sqe.printStackTrace( );
            System.exit(2);
        }
        // if everything was successful, print the envelope
        e.print( );
    }

    private void setAddress (String fname, String lname, Address a)
    {
        street = a.streetNumber + " " + a.street + " " +
            a.apartmentNumber;
        city = a.city;
        zip = " " + a.zipCode;
    }

    private void print( )
    {
        // Print the name and address on the envelope.
    }
}



C. Class Descriptor-based Serialization 

1. Class ID Serialization 

As described above, the inclusion of the detailed class
description in the object serialization makes those serializations
portable and versionable, at the expense of the sometimes
considerable size required to represent the descriptions.
A time penalty also results from the time taken to
write the description.

In accordance with the present invention, a class identifier
approach is introduced for supporting object serialization. A
Class ID (referred to hereafter as ACI) serialization is provided
as a protocol for converting between a Java object and a
binary representation. Like Sun serialization, it operates to
provide object serialization. Unlike Sun serialization,
however, the class description required in ACI is dramatically
smaller.

ACI is intended for an environment in which all classes
ever involved in any serialization are known by the environment
(as is often the case). Each class known to the
environment is represented by a compact numeric identifier,
and it is this identifier alone that is used to represent the class
description in the serialization. A table of the class identifiers
is kept at the beginning of each serialization. ACI is much
smaller but, without further enhancement, the approach
would be at the expense of portability. In accordance with
the present invention, however, a simple transformation is
applied so that any ACI serialization can be converted to a
portable serialization.

2. Class Descriptor Serialization 

Class Descriptor serialization (ACD) is identical to ACI
except that the class identifier table beginning ACI is
replaced by a table of class descriptors. These class descriptors
contain virtually the same information as Sun class
descriptors, so an ACD serialization has the same portability
characteristics as Sun serialization. Converting between ACI
and ACD serializations is a very simple and computationally
frugal process. Because both are otherwise identical (apart
from the class tables), only the class table contents
need change. The environment maintains a correspondence
between the ACI class identifiers and ACD class descriptors.
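As a rough illustration of why this conversion is cheap, the following Java sketch swaps the class identifier table at the front of an ACI serialization for a table of class descriptors and copies everything else unchanged. It relies on the header bytes (0x20 for ACI, 0xC0 for ACD), the zero-terminated identifier table, and the seven-bits-per-byte identifier encoding given in the grammar of this section. The class name AciToAcd, the descriptor map, and the nullClassdesc parameter are hypothetical; descriptors are treated as opaque, already-encoded byte strings supplied by the environment.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;

/** Hypothetical helper: rewrites only the class table of an ACI serialization. */
final class AciToAcd {
    static final int ACI_HEADER = 0x20; // serial-type-classid-header
    static final int ACD_HEADER = 0xC0; // serial-type-classdesc-header

    /**
     * @param aci             a complete ACI serialization
     * @param descriptorsById environment's class id to encoded descriptor mapping (assumed)
     * @param nullClassdesc   encoded terminator of the descriptor table (supplied by caller)
     */
    static byte[] convert(byte[] aci, Map<Long, byte[]> descriptorsById, byte[] nullClassdesc) {
        if (aci.length == 0 || (aci[0] & 0xFF) != ACI_HEADER) {
            throw new IllegalArgumentException("not an ACI serialization");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(ACD_HEADER);
        int pos = 1;
        while (true) {
            // Read one compact-int class identifier: 7 bits per byte, low group first.
            long id = 0;
            int shift = 0;
            int b;
            do {
                b = aci[pos++] & 0xFF;
                id |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            if (id == 0) {
                break; // null-classid ends the identifier table
            }
            byte[] desc = descriptorsById.get(id);
            if (desc == null) {
                throw new IllegalStateException("no descriptor for class id " + id);
            }
            out.write(desc, 0, desc.length);
        }
        out.write(nullClassdesc, 0, nullClassdesc.length);
        // The object table is identical in both formats, so it is copied verbatim.
        out.write(aci, pos, aci.length - pos);
        return out.toByteArray();
    }
}
```

Note that the object table is never parsed; only the leading table is touched, which is what keeps the transformation inexpensive.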
3. Grammar of Serialization 

Rules are provided for object serial representation as
shown below, using (typical) grammar notation.
Rule 1: [X] indicates that X is optional
Rule 2: [X ...] indicates 0 or more occurrences of X
Rule 3: X [...] indicates 1 or more occurrences of X
Rule 4: X | Y indicates X or Y must occur
Rule 5: 0x is used to precede a hexadecimal literal value
Rule 6: (datatype) indicates the Java type of the following token
Rule 7: 'C' indicates an ASCII character literal
Rule 8: (utf8) indicates a UTF8 string encoding
Using the above notation, the following object serialization
fields are defined.



object-serialization: object-serial-type classdesc-table object-table
serial-type: serial-type-classid-header | serial-type-classdesc-header
serial-type-classid-header: (byte) 0x20
serial-type-classdesc-header: (byte) 0xC0
classdesc-table: classid [...] null-classid
    | classdesc [...] null-classdesc
classid: compact-int
null-classid: (byte) 0x0
object-table: object [...] null-object
null-object: null-classid
object: class-object | simple-object | array-object
class-object: proxy-classid proxy-classid
simple-object: object-piece [...]
object-piece: proxy-classid object-piece-data
object-piece-data: field-data [...]
field-data: primitive-field-data
    | object-proxyid
primitive-field-data: boolean-field-data
    | char-field-data
    | byte-field-data
    | short-field-data
    | int-field-data
    | float-field-data
    | long-field-data
    | double-field-data
boolean-field-data: (boolean)
char-field-data: (char)
byte-field-data: (byte)
short-field-data: (short)
int-field-data: (int)
float-field-data: (float)
long-field-data: (long)
double-field-data: (double)
array-object: primitive-array-object
    | object-array-object
primitive-array-object: primitive-type array-size primitive-array-data
primitive-type: (byte) 0x5 // boolean
    | (byte) 0x6 // char
    | (byte) 0x7 // float
    | (byte) 0x8 // double
    | (byte) 0x9 // byte
    | (byte) 0xA // short
    | (byte) 0xB // int
    | (byte) 0xC // long
array-size: compact-int
primitive-array-data: boolean-array-data
    | char-array-data
    | byte-array-data
    | short-array-data
    | int-array-data
    | float-array-data
    | long-array-data
    | double-array-data
boolean-array-data: boolean-field-data [...]
char-array-data: char-field-data [...]
byte-array-data: byte-field-data [...]
short-array-data: short-field-data [...]
int-array-data: int-field-data [...]
float-array-data: float-field-data [...]
long-array-data: long-field-data [...]
double-array-data: double-field-data [...]
object-array-object: object-type object-array-class_signature array-size
    object-array-data
object-type: (byte) 0x1
object-array-class_signature: '[' [...] { primitive-signature
    | 'L' proxy-classid } '[0]'
primitive-signature: 'Z' // boolean
    | 'C' // char
    | 'F' // float
    | 'D' // double
    | 'B' // byte
    | 'S' // short
    | 'I' // int
    | 'J' // long
object-array-data: object-proxyid [...]
classdesc: classdesc-serial-type class-name class-flags
    total-class-members data-member [...]
class-name: (utf8)
member-name: (utf8)
data-member: member-name { primitive-data-member | object-data-member }
primitive-data-member: primitive-type
object-data-member: object-type object-class-name
classdesc-serial-type: 0x80
null-classdesc: classdesc-serial-type 70'
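To make the grammar concrete, the following Java sketch emits one primitive-array-object for an int array: the primitive-type byte (0xB for int), the array-size as a compact-int, and then the elements in order. The class and method names are hypothetical, and the big-endian byte order used for each (int) element is an assumption, since the grammar only names the Java type.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical writer for the primitive-array-object production (int arrays only). */
final class PrimitiveArrayWriter {
    static final int TYPE_INT = 0xB; // primitive-type code for int in the grammar

    static byte[] writeIntArray(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(TYPE_INT);                 // primitive-type
        writeCompactInt(out, values.length); // array-size
        for (int v : values) {               // primitive-array-data, 0th element first
            // Assumption: each int is written as four bytes, most significant first.
            out.write(v >>> 24);
            out.write(v >>> 16);
            out.write(v >>> 8);
            out.write(v);
        }
        return out.toByteArray();
    }

    /** Seven payload bits per byte; the high bit marks that another byte follows. */
    static void writeCompactInt(ByteArrayOutputStream out, int n) {
        do {
            int low = n & 0x7F;
            n >>>= 7;
            out.write(n != 0 ? (low | 0x80) : low);
        } while (n != 0);
    }
}
```

For example, under these assumptions writeIntArray(new int[] {1, 258}) yields the ten bytes 0B 02 00 00 00 01 00 00 01 02.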



The compact-int rule is used to indicate a format for 
storing numbers efficiently. As will be explained, proxy- 
classid and object-classid will, in general, correspond to 
relatively small numbers; however, there is no limit to how 
much they can grow. Using a fixed size (e.g., four bytes) for 
storing these identifiers would entail wasted space for the 
normally small identifiers. Using a smaller size (e.g., two 
bytes), on the other hand, would impose constraints on the 
maximum size of a serialized object. Instead, all identifiers 
are stored as compact numbers. In a compact-int, each byte 
in the quantity uses seven bits to represent the number, with 
the other bit set when there exists a following byte. A method 
is defined in the following pseudo-code. 



while( N > 0 ) {
    if( (N & ~0x7F) != 0 ) {
        WriteByte( (N & 0x7F) | 0x80 )
    } else {
        WriteByte( N & 0x7F )
    }
    N = N >> 7;
}
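The pseudo-code can be sketched in Java as follows, together with a matching decoder. The class name CompactInt and both method signatures are hypothetical. One deliberate difference is labeled in the code: a do/while loop is used so that the value 0 is emitted as a single 0x00 byte, whereas the while loop in the pseudo-code would write nothing for 0.

```java
import java.io.ByteArrayOutputStream;

/** Sketch of the compact-int format: 7 payload bits per byte, high bit means "more follows". */
final class CompactInt {

    /** Encodes a non-negative value, least significant 7-bit group first. */
    static byte[] encode(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("compact-int values are non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Assumption: do/while so that 0 becomes the single byte 0x00.
        do {
            int low = (int) (n & 0x7F);
            n >>>= 7;
            out.write(n != 0 ? (low | 0x80) : low);
        } while (n != 0);
        return out.toByteArray();
    }

    /** Decodes one compact-int starting at offset; returns {value, offset of next byte}. */
    static long[] decode(byte[] buf, int offset) {
        long value = 0;
        int shift = 0;
        int pos = offset;
        while (true) {
            int b = buf[pos++] & 0xFF;
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return new long[] { value, pos };
            }
            shift += 7;
        }
    }
}
```

With this encoding, identifiers below 128 occupy one byte, while 300 occupies two (0xAC followed by 0x02), so small identifiers stay small without capping how large they can grow.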
Now, to understand a method for improved object
serialization, assume the following notation. Let O be the
object being serialized. Let R(O) be the set of all objects
reachable from O. Let C(O) be the set containing the class
and superclasses of O. If O is in fact a class object, then C(O)
contains O, as well. Let C(R(O)) = {C(o) for each o in R(O)}.
Let clid(C) be the class id of class C. Let proxy(C) be the
proxy id of class C. The proxy id of a class is its ordinal
position within the class table plus 0x10. The addition of
0x10 is to allow distinguishing from the primitive-array-types.
Let proxy(O) be the proxy id of object O. The proxy
id of an object is its ordinal position within the object table.
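The notation above can be sketched in Java as follows. The class ProxyIds and its methods are hypothetical. Two readings are assumed and marked in the code: ordinal positions are counted from 1 (so the first class proxy is 0x11), and object proxies start at 1 so that 0 can stand for a null reference. Only ordinary (non-class) objects are handled by classChain.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helpers mirroring C(O), proxy(C) and proxy(O). */
final class ProxyIds {
    /** Added to class positions so class proxies stay clear of the array type codes. */
    static final int CLASS_PROXY_OFFSET = 0x10;

    /** C(O) for an ordinary object: its class followed by its superclasses. */
    static List<Class<?>> classChain(Object o) {
        List<Class<?>> chain = new ArrayList<>();
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            chain.add(c);
        }
        return chain;
    }

    /** proxy(C): position in the class table (assumed 1-based) plus 0x10. */
    static int classProxy(List<Class<?>> classTable, Class<?> c) {
        int index = classTable.indexOf(c);
        if (index < 0) {
            throw new IllegalArgumentException("class not in table: " + c);
        }
        return index + 1 + CLASS_PROXY_OFFSET;
    }

    /** proxy(O): position in the object table (assumed 1-based); 0 denotes null. */
    static int objectProxy(List<Object> objectTable, Object o) {
        if (o == null) {
            return 0;
        }
        for (int i = 0; i < objectTable.size(); i++) {
            if (objectTable.get(i) == o) { // identity, not equals()
                return i + 1;
            }
        }
        throw new IllegalArgumentException("object not in table");
    }
}
```

Objects are matched by identity rather than by equals(), since two distinct but equal objects reachable from O would need separate entries in the object table.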
An improved method of object serialization may be
summarized by the following method steps, as illustrated in
FIGS. 3A-E. At the outset, at step 301, a byte representing
the type of serialization is written, serial-type-classid-header.
Next, at step 302, the classid-table is written using
the following logic: for each class C in C(R(O)), the classid
of C, clid(C), is written. For each C, a proxy(C) is generated,
whose value is the position in the classid table, starting at
position 1. The classid table is terminated by a null classid,
0.

Now, for each object o in R(O), the object may be
streamed out as follows. As shown at step 303, the method
switches (i.e., branches) based on object type. If o is a
primitive array, the method branches to step 311, to apply the
following substeps (substeps 311a-c, shown in FIG. 3B).

(a) Write primitive array type;
(b) Write the size of the array; and
(c) For each primitive element of o, write the element
beginning with the 0th element.

If, on the other hand, o is an object array, the method
branches to step 312, to apply the following substeps
(substeps 312a-d, shown in FIG. 3C).

(a) Write object array type;
(b) Write the signature of the array;
(c) Write the size of the array; and
(d) For each object element p of o, write proxy(p), or 0 if
p is null, beginning with the 0th element.

If o is a class object, the method branches to step 313, to
apply the following substeps (substeps 313a-b, shown in
FIG. 3D).

(a) Write proxy(class(o)); and
(b) Write proxy(o).

Otherwise, the method branches to step 314, to apply the
following substeps (substeps 314a-b, shown in FIG. 3E).

(a) Write the proxy(class(o)).
(b) For each class or superclass class(i,o) of o, starting
from the most derived class:
    For each serializable field f of class(i,o):
        If f is a primitive typed field, write the corresponding
        value in o;
        Otherwise, f must be an object typed field. Therefore,
        let p be the corresponding object value in o. If p
        is null, write 0, else write proxy(p).

The method concludes by terminating the object-table with
the null proxyid, 0, as shown by step 304 (FIG. 3A).

D. Practical use and test results

The SQL employed in a DBMS may be extended to allow
the installation of Java classes into a database. For instance,
the database engine Sybase Adaptive SQL Anywhere (ASA)
includes a Java VM, thus allowing Java to be invoked from
SQL and run in the context of the database engine. In
addition, database table columns can be created with type
corresponding to Java types, allowing the storage of Java
objects in the database. Database data is generally saved in
persistent stores, so ASA may store its Java objects in the
database using a serialization of the object.

A compact serialization for storing Java objects was
preferred. Database data is generally kept as compact as
possible within reason. Clearly, the amount of compaction
must be weighed against the time required to do the compacting.
Compact data leads to less space required to store
the data, and less I/O time required to read and write the
data. The absence of class descriptor information makes ACI
a much more compact serialization than Sun. Within the
database environment, all classes ever installed into the
database are known, so the class identifiers needed by ACI
can be and are maintained within the database.

Consider, for instance, the following simple class.

public class MyClass implements java.io.Serializable {
    int field1;
    int field2;
    public MyClass( int f1, int f2 ) {
        field1 = f1;
        field2 = f2;
    }
}

An instance of MyClass serialized using Sun serialization
requires 54 bytes. The same instance serialized via ACI is 16
bytes, and by ACD is 36 bytes. These results are summarized
by the following Table 1.

TABLE 1
TEST RESULTS

Serialization methodology    Size
Sun serialization            54 bytes
ACI serialization            16 bytes
ACD serialization            36 bytes

E. Portability

Although ACI is very efficient to use for storing Java
objects within a closed environment like a database, there
are cases where an object leaves the confines of the database,
and hence there is a necessity for a portable format. The best
example of the need for such a portable format is database
replication. In replication it is necessary to transfer data from
one database to another. Since class identifiers are unique
only to a particular database, class identifiers in one database
will likely not correspond to the same classes in another
database. ACD serialization provides the necessary
class description to allow portability.

The problem of replication has some other interesting
characteristics. Replication is concerned with syncing database
data across multiple databases, so replication most
often just replicates database changes. These changes are
most efficiently drawn from the database log file. In fact,
replication does not even need to communicate with the
database engine. Replication needs to only understand the
log file.

In the system of the present invention, the system's log
file stores Java object serializations in ACI format, but
replication requires them in ACD format. This is where the
class table beginning ACI and ACD are particularly advantageous.
Class descriptors are also stored in the log file, so
as a replication process scans a log file, it builds up a list of
known class descriptors with their corresponding class
identifiers, and replaces the ACI class table with an ACD class
table in every Java object serialization. Hence, without even
requiring a running Java VM, Java objects can be easily
transformed from one format to the other.

While the invention is described in some detail with
specific reference to a single preferred embodiment and
certain alternatives, there is no intent to limit the invention
to that particular embodiment or those specific alternatives.
Thus, the true scope of the present invention is not limited
to any one of the foregoing exemplary embodiments but is
instead defined by the appended claims.

What is claimed is:

1. In a system comprising a computer network having a
database server and a client, an improved method for allowing
a client to retrieve an object stored in a database table
residing on a database server, the method comprising:

providing a streaming protocol for transferring objects 
from the database server to the client; 

receiving from the client a request for serialization of a
particular object for transferring the particular object 
from the database server to the client, wherein said 
particular object is a Java object comprising at least one 
class, and wherein said particular object is stored in a 
relational database table at the database server; 

in response to the request, creating a class identifier for 
uniquely identifying each class from which the particu- 
lar object is derived that is already known to the 
system, thereby supporting conversion of the particular 
object to and from a binary representation without 
transmitting class descriptor information; 

creating a serialization comprising a binary representation 
of the particular object suitable for streaming 
transmission, said serialization including a table of said
class identifiers for the particular object; 

streaming the binary representation of the particular 
object from the database server to the client; and 

upon receipt of the streamed binary representation at the 
client, recreating at the client a copy of said particular
object. 

2. The method of claim 1, further comprising converting 
said serialization into a portable serialization by: 

creating a class descriptor for each class from which the 
particular object is derived, for providing detailed class 
description information in the object serialization for 
making the serialization portable; 

for each class identifier of a given class, specifying a 
correspondence between the class identifier of the
given class and a class descriptor for that class, wherein 
said class descriptor comprises information for con- 
verting the particular object to and from a binary 
representation when the given class is unknown to the 
system; and

transforming said serialization into a portable serializa- 
tion by replacing said table of class identifiers with a 
suitable table of class descriptors. 

3. The method of claim 2, wherein said objects comprise 
Java objects and wherein said portable serialization com- 
prises Java-compatible serialization. 
4. The method of claim 2, wherein said table of class
identifiers requires substantially less storage than said table 
of class descriptors. 

5. The method of claim 2, wherein the correspondence 
between a class identifier of a given class and a correspond- 
ing class descriptor for that class is maintained by the 
system. 

6. The method of claim 1, wherein each class identifier 
comprises a numeric identifier. 

7. The method of claim 1, wherein each class identifier 
comprises a compact numeric identifier comprising a quan- 
tity of at least one byte value. 

8. The method of claim 7, wherein said compact numeric 
identifier comprises a variable-length numeric identifier 
wherein each byte of the identifier uses seven bits to 
represent a number quantity and one bit to indicate whether 
an additional byte follows for the identifier. 

9. The method of claim 1, wherein said objects comprise 
Java objects derived from Java classes. 

10. The method of claim 1, further comprising: 
storing said serialization in a database table at the data- 
base server. 

11. The method of claim 1, wherein said request com- 
prises an SQL query received from the client. 

12. The method of claim 1, wherein said Java object
includes instantiated Java class data members and class 
methods. 

13. The method of claim 1, wherein said client comprises 
a database application executing at a client machine. 

14. The method of claim 1, wherein said protocol com- 
prises a token-based protocol. 

15. The method of claim 1, wherein said particular object 
comprises a Java object stored as column data in a database 
table of the database server. 

16. The method of claim 1, wherein said serialization 
includes at its beginning said table of said class identifiers 
for the particular object. 

17. The method of claim 1, wherein said system maintains 
a table of classes known to the system. 

18. The method of claim 17, wherein a class identifier for 
a given class is created, at least in part, by basing the class 
identifier on an ordinal position of the given class in said 
table of classes. 